XMapGen and XMapSiG results for OAEI 2013
Abstract
The XMapGen and XMapSig systems are flexible, self-configuring matching tools that use different strategies to combine multiple similarity measures into a single aggregated metric, with the final aim of improving alignment quality for large-scale ontologies. XMapGen and XMapSig are two variants of XMap++. This paper presents the results obtained by the two ontology matching tools in the 9th edition of the Ontology Alignment Evaluation Initiative (OAEI 2013) campaign.
1 Presentation of the system
We present two fully automatic, general-purpose ontology alignment tools, XMapGen (eXtensible Mapping using Genetic) and XMapSig (eXtensible Mapping using Sigmoid), which are new and lighter implementations of their ancestor XMap++ [1]. XMapGen and XMapSig include several matchers, which calculate similarities between the terms of the different source ontologies. The matchers implement strategies based on linguistic matching, structure-based strategies, and strategies that use auxiliary information from the WordNet thesaurus to enhance the alignment process. XMapGen uses a Genetic Algorithm (GA) as a machine learning-based method to determine how to combine multiple similarity measures into a single aggregated metric, with the final aim of improving the ontology alignment quality. XMapSig uses a sigmoid function [4] to combine the corresponding weights for different semantic aspects, reflecting their different importance. This year, XMapGen and XMapSig participate in five tracks: Benchmark, Conference, Library, Anatomy and Large Biomedical Ontologies.
1.1 State, purpose, general statement
XMapGen and XMapSig are scalable ontology alignment tools capable of matching English-language ontologies described in different OWL languages (i.e., OWL Lite, OWL DL, and OWL Full).
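As an illustration of combining several similarity measures into one aggregated metric, the following minimal Python sketch averages, weights, or sigmoid-smooths three toy similarity matrices. The function names `sigmoid` and `aggregate`, the parameters `center` and `steepness`, and all scores are assumptions made for illustration. The sketch also assumes the sigmoid is applied to each score before the weighted sum, which the text does not spell out. It is not the XMapGen or XMapSig implementation.

```python
import math


def sigmoid(x, center=0.5, steepness=8.0):
    # Smoothed threshold: scores above `center` move toward 1, scores below toward 0.
    # `center` and `steepness` are illustrative values, not taken from the paper.
    return 1.0 / (1.0 + math.exp(-steepness * (x - center)))


def aggregate(matrices, weights=None, use_sigmoid=False):
    """Combine several n x m similarity matrices into a single n x m matrix."""
    if weights is None:
        weights = [1.0] * len(matrices)  # equal weights reduce to a plain average
    total = sum(weights)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(matrices, weights):
        for i in range(rows):
            for j in range(cols):
                score = sigmoid(matrix[i][j]) if use_sigmoid else matrix[i][j]
                combined[i][j] += (weight / total) * score
    return combined


# Toy scores for two source entities against two target entities (hypothetical data).
string_sim = [[0.9, 0.1], [0.2, 0.6]]
semantic_sim = [[0.7, 0.3], [0.1, 0.8]]
structural_sim = [[0.8, 0.2], [0.3, 0.7]]

average = aggregate([string_sim, semantic_sim, structural_sim])
weighted = aggregate([string_sim, semantic_sim, structural_sim], weights=[2.0, 1.0, 1.0])
smoothed = aggregate([string_sim, semantic_sim, structural_sim], use_sigmoid=True)
```

With equal weights the result is the plain average. Passing `weights` gives a weighted sum, however those weights were obtained (set by hand or learned, for example by a GA). The sigmoid variant pushes high scores up and low scores down before they are combined.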
The guiding principle of the matching strategy in XMapGen and XMapSig is to combine multiple similarity measures into a single similarity metric, using weights determined by intelligent strategies in order to avoid the burden of manual selection. Despite the benefits of adding a GA, aligning medium-sized and large-scale ontologies is still very time consuming and computationally expensive. This motivates us to consider parallel matching on multiple cores or machines to address the scalability issue in ontology matching.
1.2 Specific techniques used
In this section, the workflow of XMap++ and its main components is briefly described; it is shown in Fig. 1. Both XMapGen and XMapSig calculate three different basic measures to create three similarity matrices. The three categories of similarity measure are string-based, semantic and structural.
Fig. 1. Sketch of Architecture for XMAP++.
In the XMap++ approach, a generic workflow for a given ontology matching scenario is as follows:
1. The matching inputs are two ontologies, source O and target O′, parsed by an Ontology Parser component;
2. The String Matcher, based on linguistic matching, compares the textual descriptions of the concepts associated with the nodes (labels, names) of each ontology;
3. The Linguistic Matcher jointly aims at identifying words in the input strings, relying on WordNet [5], which provides additional information towards unveiling mappings in cases where features such as labels are missing or where names are replaced by random strings. Because these matching techniques may provide incorrect match candidates, structural matching is used to correct such candidates based on their structural context. In order to deal with lexical ambiguity, we introduce the notion of the scope of a concept, which represents the context in which it is placed.
In our approach, the similarity between two entities of different ontologies is evaluated not only by investigating the semantics of the entity names, but also by taking into account the local context through which the effective meaning is described, in particular the neighbourhood of a term (its immediate parent and children in the is-a hierarchy). Increasing the radius means enlarging the scope (i.e., this area) and, consequently, the set of neighbour concepts that intervene in the description of the context. The value of linguistic methods is added to the linguistic matcher or the structure matcher in order to reduce semantic ambiguity during the comparison of entity names;
4. The structural matcher aligns nodes based on their adjacency relationships. On the one hand, the relationships (e.g., subClassOf and is-a) that are frequently used in the ontology serve as the foundation of the structural matching. On the other hand, the structural rules are used to extract the ontological context of each node, up to a certain depth (radius). This context includes some of its neighbours, each of which is associated with a weight representing its importance when evaluating the contextual node. The XMap++ algorithm values the semantic relation between two concepts while taking into consideration the types of cardinality constraints (e.g., OWLAllValuesFrom, OWLSomeValuesFrom, OWLMinCardinality, OWLCardinality, OWLMaxCardinality, Same as or Kind of) and the values between their properties (e.g., OWLMaxCardinality >= 1). Alignment suggestions are then determined by combining and filtering the results generated by one or more matchers;
5. The three matchers perform similarity computation in which each entity of the source ontology is compared with all the entities of the target ontology, thus producing three similarity matrices, which contain a value for each pair of entities.
After that, an aggregation operator is used to combine the similarity matrices computed by the different matchers into a single aggregated n × m similarity matrix, where n is the number of elements in the source ontology and m is the number of elements in the target ontology. We refer to [1] for more detail about the pruning and splitting techniques applied to data matrices for pairs of entities;
6. XMap++ uses three types of aggregation operator; these strategies are aggregation, selection and combination. The aggregation reduces the similarity cube to a matrix by aggregating all matchers' result matrices into one. This aggregation is defined by five strategies: Max, Min, Average, Sigmoid and Weighted. The Max strategy is an optimistic one, selecting the highest similarity value calculated by any matcher. Conversely, the Min strategy selects the lowest value. Average evens out the matcher results by calculating their average. The Sigmoid strategy combines multiple results using a sigmoid function, which is essentially a smoothed threshold function [4]. To reflect the different importance of matcher results, Weighted computes the weighted sum of the results, according to user-defined weights or to weights defined automatically using a dynamic strategy [3], an Artificial Neural Network (ANN) (Djeddi and Khadir, 2013) or a Genetic Algorithm (GA);
7. Finally, these values are filtered using a selection according to a defined threshold and the desired cardinality. In our algorithm, we adopt the 1-1 cardinality to find the optimal solution in polynomial time.
1.3 Adaptations made for the evaluation
Several technical adaptations were required for integrating the system into the SEALS platform, such as:
– Updating some libraries (e.g., Alignment API) or changing the way some parameters are communicated.
– To deal with large ontologies, we conducted specific experiments with XMapGen and XMapSig to see whether a matching system can exploit a multi-core architecture [6] to speed up the matching process. We adapted parallel matching to use threading, distributing the jobs of two matchers (the Classes matcher and the Properties matcher) over all available CPU cores on a single machine.
– Two factors directly affect the systems' performance. The first relates to matching by a machine learning model: the training data and the similarity metrics selected as learning attributes are important. A simple solution to this issue is to select the most appropriate similarity metrics and training data according to their correlation with the expert's assessment. The second relates to the threshold used as a filter in the selection module: different tests require different thresholds.
– In XMap++, the aim of the Structural Matcher is to correct match candidates based on their structural context. The structural approach matches the nodes based on their adjacency relationships. When the total number of entities is greater than 1500 in each ontology, XMapGen and XMapSig exploit only the superclass-subclass relationships (subsumption relationships) that are frequently used in ontologies. We restrict the contextual similarity computation to the value of the semantic relation between two concepts, without taking into consideration the types of cardinality constraints and the values between their properties, because as the ontologies become larger, the efficiency of automatic alignment methods decreases considerably in terms of execution time and memory size.
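The threading adaptation described above (one job for the Classes matcher and one for the Properties matcher) might be sketched in Python as follows. This is an assumption-laden illustration rather than the systems' code. The function names, the `difflib`-based string measure, the 0.8 threshold and the labels are all hypothetical. Note also that CPython threads do not by themselves give CPU-bound speed-up, so a process pool would be the closer analogue to multi-core execution there.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def label_similarity(a, b):
    # Stand-in string measure; the measures actually used by the systems are not given here.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_entities(source, target, threshold):
    """Compare every source label with every target label; keep pairs at or above threshold."""
    matches = []
    for s in source:
        for t in target:
            score = label_similarity(s, t)
            if score >= threshold:
                matches.append((s, t, round(score, 3)))
    return matches


def match_ontologies(src_classes, tgt_classes, src_props, tgt_props, threshold=0.8):
    # One job per entity kind, mirroring the Classes matcher / Properties matcher split.
    with ThreadPoolExecutor(max_workers=2) as pool:
        class_job = pool.submit(match_entities, src_classes, tgt_classes, threshold)
        prop_job = pool.submit(match_entities, src_props, tgt_props, threshold)
        return class_job.result(), prop_job.result()


# Hypothetical labels from two small ontologies.
class_matches, property_matches = match_ontologies(
    ["Author", "Paper"], ["author", "Article"],
    ["hasTitle", "writtenBy"], ["has_title", "reviewedBy"],
)
```

The threshold is passed in as a parameter because, as noted above, different tests require different thresholds.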
Similar resources
Results of the Ontology Alignment Evaluation Initiative 2013
Ontology matching consists of finding correspondences between semantically related entities of two ontologies. OAEI campaigns aim at comparing ontology matching systems on precisely defined test cases. These test cases can use ontologies of different nature (from simple thesauri to expressive OWL ontologies) and use different modalities, e.g., blind evaluation, open evaluation and consensus. OA...
XMap++: results for OAEI 2014
In this paper, we present the results obtained by our ontology matching system XMap++ within the OAEI 2014 campaign. XMap++ is a scalable ontology alignment tool capable of matching large-scale ontologies. This is our second participation in the OAEI, and we can see an overall improvement on nearly every task. 1 State, purpose, general statement XMap (eXtensible Mapping) is an ontology alignment...
Automating OAEI Campaigns
This paper reports the first effort into integrating OAEI and SEALS evaluation campaigns. OAEI is an annual evaluation campaign for ontology matching systems. The 2010 campaign includes a new modality in coordination with the SEALS project. This project aims at providing standardized resources (software components and data sets) for automatically executing evaluations of typical semantic web to...
Is my ontology matching system similar to yours?
The quality of the mappings computed by an ontology matching system in the Ontology Alignment Evaluation Initiative (OAEI) [2, 1] is typically measured in terms of precision and recall with respect to a reference set of mappings. Additionally, the OAEI also evaluates the coherence of the computed mappings [1]. However, the differences and similarities among the mappings computed by different sy...
CroLOM results for OAEI 2017: summary of cross-lingual ontology matching systems results at OAEI
This paper presents the results obtained in the OAEI 2017 campaign by our ontology matching system CroLOM. CroLOM is an automatic system especially designed for aligning multilingual ontologies. This is our second participation with CroLOM in the OAEI and the results have so far been positive.
Publication date: 2013